ABSTRACT


This paper presents SABC-DQN (Sophisticated Artificial Bee Colony Inspired Deep Q-Networks), a novel 
approach to enhance the sentiment analysis of Coursera course reviews. Sentiment analysis is crucial for 
understanding learner opinions, but existing methods struggle to capture the nuanced sentiments expressed 
in textual data accurately. SABC-DQN combines the intelligent exploration capabilities of Artificial Bee 
Colony (ABC) optimization algorithms with the power of Deep Q-Networks (DQN). The ABC optimization 
algorithms mimic honey bees’ efficient foraging behaviour, enabling effective solution space exploration. 
DQN, a reinforcement learning technique, utilizes a deep neural network to learn and approximate the optimal 
policy for sentiment classification. The SABC-DQN approach operates through a multi-step process. 
Initially, the ABC optimization algorithm guides the exploration of the solution space, identifying optimal 
features that capture sentiment-related information. These features are then employed to train the DQN, 
leveraging the representation learning capabilities of the deep neural network to predict sentiment labels 
accurately. Experimental evaluations conducted on a dataset of Coursera course reviews demonstrate the 
efficacy of SABC-DQN in enhancing sentiment analysis. The proposed approach outperforms existing 
methods, achieving superior accuracy, precision, recall, and F1 score. SABC-DQN exhibits robustness when 
faced with variations in review length, domain-specific jargon, and grammatical errors. SABC-DQN 
introduces a novel solution for empowering sentiment analysis of Coursera course reviews. By integrating 
sophisticated Artificial Bee Colony optimization algorithms with Deep Q-Networks, SABC-DQN provides 
an advanced mechanism to capture nuanced sentiments expressed in textual data. The proposed approach has 
the potential to significantly improve sentiment analysis accuracy, facilitating a deeper understanding of 
learner perspectives in online educational platforms. 


Keywords: Artificial Bee Colony, Coursera, Deep Q-Networks, SABC-DQN, Sentiment Analysis, Textual Data Analysis
1. INTRODUCTION


In the digital era, online learning has emerged as a 
transformative educational approach driven by 
technological advancements and the internet’s 
pervasive reach. With the integration of Learning 
Management Systems (LMS) and adaptive learning 
algorithms, online learning platforms provide 
learners with personalized and adaptive educational 
experiences [1]. Through intelligent algorithms, 
learners receive targeted feedback, adaptive 
assessments, and customized learning paths, 
optimizing their learning outcomes. The flexibility 
of online learning enables learners to access 
educational content anytime and anywhere, 
leveraging cloud-based storage and mobile 
applications. Additionally, integrating virtual reality 
(VR) and augmented reality (AR) technologies 
within online learning environments offers 
immersive and interactive experiences, allowing 
learners to engage with realistic simulations and 
practical scenarios [2]. Using big data analytics and 
learning analytics enables educators to gain insights 
into learners’ progress and tailor instructional 
strategies accordingly. Online learning facilitates the 
development of digital literacy and 21st-century 
skills by providing collaboration, critical thinking, 
and problem-solving opportunities through online 
discussions, group projects, and online assessments. 
Embracing online learning opens up a new era of 
educational possibilities, empowering learners to 
thrive in the digital age [3]. 


In the era of digitalization, our means of 
communication and self-expression have undergone 
a revolutionary transformation, resulting in a 
tremendous volume of textual data. Within this 
expansive realm of text, sentiment analysis has 
emerged as a formidable technique for deciphering 
the underlying sentiments and attitudes conveyed 
[4]. Also referred to as opinion mining, sentiment 
analysis employs computational methods, natural 
language processing, and machine learning 
algorithms to analyze and interpret the emotional 
tone expressed in written text. An essential 
application of sentiment analysis lies in managing 
customer experiences. Businesses can glean 
invaluable insights into customer sentiments, 
satisfaction levels, and preferences by scrutinizing 
customer feedback, online reviews, and social media 
interactions. This wealth of information empowers 
companies to make informed decisions, enhance 
their products and services, and deliver unparalleled 
customer experiences [5], [6]. 


Sentiment analysis plays a pivotal role in 
monitoring social media platforms. These virtual 
arenas are fertile ground for capturing public 
opinions and sentiments [7]. By meticulously 
analyzing tweets, posts, comments, and other user- 
generated content, sentiment analysis aids in 
monitoring public sentiment toward brands, events, 
or trending topics. This strategic intelligence equips 
marketers, advertisers, and decision-makers with a 
profound understanding of public perception, 
enabling them to tailor their strategies accordingly 
[8]. Sentiment analysis also finds fruitful 
applications in political analysis and public opinion 
research. Researchers and policymakers can gauge 
public opinion on specific policies, politicians, or 
social issues by dissecting sentiments expressed in 
news articles, blogs, or social media posts. This 
analytical prowess informs decision-making 
processes, facilitates effective communication 
strategies, and identifies areas of concern. 


Sentiment analysis holds immense value in 
market research and competitive analysis. 
Businesses can acquire insights into consumer 
preferences, sentiments toward competitors, and 
emerging market trends by analyzing online reviews, 
forum discussions, or survey responses [9]. With this 
knowledge, companies can adapt their marketing 


strategies, devise innovative products, and gain a 
competitive edge [10]. As sentiment analysis 
advances, researchers delve into new dimensions, 
such as aspect-based sentiment analysis and emotion 
detection. Aspect-based sentiment analysis strives to 
identify sentiments directed toward specific aspects 
or features of a product or service, unearthing more 
nuanced insights [11]. Emotion detection focuses on 
categorizing the emotions expressed in text, 
unravelling a deeper understanding of user 
experiences and reactions. When bio-inspired 
optimization techniques [12]-[28] are applied to 
sentiment analysis, the 
objective is to achieve improved accuracy, 
efficiency, or performance compared to traditional 
methods. By leveraging the adaptive and efficient 
strategies inspired by biological systems, bio- 
inspired optimization can potentially enhance 
sentiment analysis algorithms, leading to more 
accurate sentiment classification, better prediction of 
emotions, or improved interpretation of textual data 
[29]. 


1.1. Problem Statement 

The subjectivity and variability of human 
emotions and opinions pose a significant challenge 
in sentiment analysis. Sentiments can be expressed 
differently by different individuals, and they can 
vary based on cultural, demographic, and personal 
factors. Existing sentiment analysis models often 
struggle to handle this subjectivity and variability, 
leading to inconsistencies and biases in sentiment 
classification. This problem necessitates the 
development of robust sentiment analysis models 
that can account for individual differences, cultural 
variations, and demographic influences, resulting in 
more accurate and unbiased sentiment analysis 
across diverse populations. Addressing the challenge 
of subjectivity and variability requires advancements 
in machine learning algorithms and models that can 
effectively capture and represent the diverse range of 
human sentiments and opinions. 


1.2. Motivation 

The motivation behind tackling the 
subjectivity and variability challenges in sentiment 
analysis is to achieve more accurate and unbiased 
classification. By developing robust sentiment 
analysis models that can handle individual 
differences, cultural variations, and demographic 
influences, we can better understand diverse 
perspectives and ensure fair representation of 
sentiments across different populations. This 
motivation stems from the desire to avoid biased 
decision-making, provide inclusive customer 
experiences, and enable policymakers to gauge 
public sentiment more accurately, leading to 
equitable policies and initiatives. 


1.3. Objective 
Design and implement a robust sentiment 
analysis algorithm to handle the subjectivity and 
variability of human emotions and opinions. This 
objective aims to develop techniques to account for 
individual differences, cultural variations, and 
demographic influences in sentiment analysis. It 
seeks to enable unbiased sentiment classification by 
accurately representing diverse perspectives and 
ensuring fair representation of sentiments across 
different populations. 
• Handle the subjectivity and variability of 
human emotions and opinions. 
• Account for individual differences, cultural 
variations, and demographic influences. 
• Enable unbiased sentiment classification. 


2.0 LITERATURE REVIEW 


“Local and Global Context Focus 
Multilingual Learning Model” [30] is designed to 
handle multilingual data, enabling sentiment 


analysis across different languages. It focuses on 
capturing both the local context, which considers the 
specific aspect and its surrounding words, and the 
global context, which considers the overall 
sentiment patterns and dependencies in the text. By 
incorporating both levels of context, the model 
provides a comprehensive understanding of aspect- 
based sentiment. This approach improves the 
accuracy and robustness of sentiment analysis across 
languages and domains, making it valuable for 
analyzing sentiment in multilingual datasets, such as 
customer reviews, social media content, and opinion 
mining. “Deep Multichannel Neural Networks” [31] 
leverage deep neural networks with multiple 
channels to capture different aspects of sentiment 
information. By incorporating a variational 
information bottleneck, the model learns to extract 
the most relevant and informative features for 
sentiment analysis while minimizing information 
redundancy. This enables the model to compress the 
input data effectively while preserving essential 
sentiment-related information. The deep 
multichannel architecture allows the model to 
capture linguistic aspects, such as word embeddings, 
syntactic structures, and semantic relationships, 
resulting in comprehensive sentiment analysis. This 
approach provides enhanced accuracy and 
interpretability, making it valuable in applications 
like opinion mining, social media sentiment 


analysis, and customer feedback analysis. 
Bio-inspired optimization is also applied in many 
studies to achieve the best expected results. 


“Multimodal Sentiment Analysis” [32] 
consists of two main components: a unimodal 
reinforced Transformer and a time squeeze fusion 
layer. The unimodal reinforced Transformer is used 
to progressively attend to and distil unimodal 
information from the multimodal embedding. The 
time squeeze fusion layer fuses the unimodal 
reinforced embeddings into a final multimodal 
embedding. UR-Transformer has been evaluated on 
MOSEI, a large-scale multimodal sentiment 
analysis dataset. It has been shown to outperform 
state-of-the-art methods on this dataset. “Innovative 
Sentiment Analysis” [33] focuses on individuals’ 
collective behavior and sentiment patterns. By 
analyzing large-scale social media data or other 
sources of public opinion, this approach identifies 
trends, group dynamics, and the influence of social 
interactions on sentiment. It provides insights into 
the collective sentiment of a community or market, 
enabling a deeper understanding of herd behavior 
and its impact on decision-making processes. This 
innovative sentiment analysis technique has 
applications in finance, marketing, and social 
sciences, offering valuable insights into the 
dynamics of collective sentiment and the potential 
for predicting and understanding group behavior. 


“Weakly Supervised Framework” [34] has 
been developed for aspect-based sentiment analysis 
on students’ reviews of Massive Open Online 
Courses (MOOCs). This framework addresses the 
challenge of limited labelled data by leveraging 
weak supervision, where aspect-level sentiment 
annotations are automatically generated using 
heuristics or distant supervision techniques. The 
framework identifies aspects and associates 
sentiment polarity with them by incorporating 
domain-specific knowledge and linguistic patterns. 
This approach allows for analyzing sentiment 
towards specific aspects mentioned in students’ 
reviews, such as course content, instructors, or 
platform usability. The weakly supervised 
framework enables a more scalable and efficient 
sentiment analysis process, providing valuable 
insights into students’ opinions and sentiments 
regarding different aspects of MOOCs. It can aid in 
improving course offerings, identifying strengths 
and weaknesses, and enhancing students’ overall 
learning experience in online education platforms. 
“Lexicon Embedding and Polar Flipping Technique” 
[35] capture the sequential dependencies of words in 
understanding of the sentiment expressed. At the 
word level, lexicon embedding incorporates 
sentiment information from pre-defined sentiment 
lexicons. This enhances the model’s capture of 
sentiment nuances and improves sentiment 
classification accuracy. The polar flipping technique 
further enhances the model by flipping sentiment 
polarities of certain words based on contextual 
usage, ensuring better sentiment representation. The 
Two-Level LSTM with lexicon embedding and 
polar flipping offers an effective solution for 
sentiment analysis, achieving improved 
performance and capturing fine-grained sentiment 
information. 


“Context-Dependent Part-of-Speech” [36] 
involves analyzing the contextual information 
surrounding words to identify relevant POS chunks 
contributing to sentiment. Considering the larger 
syntactic context, the sentiment lexicon becomes 
more accurate and contextually relevant. This 
approach enables the disambiguation of words with 
multiple meanings and helps capture the intended 
sentiment in a given context. By constructing a 
sentiment lexicon based on context-dependent POS 
chunks, sentiment analysis models can provide more 
precise and reliable sentiment predictions, 
improving the overall accuracy and contextual 
understanding of sentiment in textual data. “Context- 
Specific Heterogeneous Graph Convolutional 
Network” [37] leverages the inherent contextual 
information present in the text by utilizing a 
heterogeneous graph structure. By representing 
words, entities, and their relationships as nodes in 
the graph, CSH-GCN captures the intricate 
dependencies and interactions between them. The 
model incorporates context-specific information to 
enhance sentiment analysis accuracy, considering 
the nuanced meanings of words in different contexts. 
Through graph convolutional operations, CSH-GCN 
effectively propagates information and captures the 
rich semantic relationships within the heterogeneous 
graph. This approach enables a comprehensive 
understanding of implicit sentiment in text data, 
making it valuable for applications such as social 
media sentiment analysis, opinion mining, and 
recommendation systems. The CSH-GCN model 
significantly improves the accuracy and granularity 
of sentiment analysis by leveraging context-specific 
information and heterogeneous graph structures. 


“Knowledge-Guided Sentiment Analysis” 
[38] leverages the knowledge embedded within 
human-generated explanations to improve the 


results. By learning from these explanations, the 
model gains insights into the reasoning 
process and the linguistic cues that indicate 
sentiment. This approach allows the model to 
capture complex sentiment patterns and understand 
the underlying factors contributing to sentiment 
expressions. By integrating human knowledge and 
explanations, the knowledge-guided sentiment 
analysis model provides more reliable and 
transparent results, facilitating better decision- 
making in various domains, such as customer 
feedback analysis, social media monitoring, and 
opinion mining. “Entity-Sensitive Attention and 
Fusion Network” [39] addresses the challenges of 
sentiment analysis by considering both textual and 
visual modalities and their relationship with specific 
entities or targets within the text. ESAF-Net 
incorporates entity-sensitive attention mechanisms 
to dynamically weigh the importance of different 
modalities and their associated entities. By focusing 
on relevant entities, the model effectively captures 
the sentiment expressed towards them. The fusion 
network integrates multimodal information and 
generates a comprehensive representation that 
combines textual and visual cues. This enables 
ESAF-Net to make accurate sentiment predictions at 
the entity level. The model is beneficial for 
analyzing sentiment in diverse multimedia sources, 
such as product reviews, social media posts, and 
online discussions, providing a deeper 
understanding of sentiment towards specific entities 
or targets. 


“Naive Bayes Classification Algorithm 
(NBCA)” [40] offers several benefits when 
employed in sentiment analysis. It is a simple and 
easily understandable method, making it accessible 
to individuals new to machine learning. With 
minimal parameter tuning, it reduces the complexity 
of model selection. NBCA performs well with high- 
dimensional data, making it suitable for sentiment 
analysis tasks with numerous features. Its efficiency 
in handling large volumes of text data is achieved by 
assuming independence between features, resulting 
in faster training and prediction times. Furthermore, 
NBCA is robust to irrelevant features and missing 
data, neither of which significantly degrades its 
performance. It also requires a relatively small 
amount of training data to estimate the necessary 
probabilities, making it suitable for scenarios with 
limited labelled data. “Random Forest Classification 
Algorithm (RFCA)” [41] is an ensemble learning 
algorithm that is highly effective in sentiment 
analysis tasks. It can capture nonlinear relationships 
between text features and sentiment, making it 
suitable for scenarios where sentiment depends on 
various factors within the text. RFCA reduces 
overfitting by aggregating the predictions of 
multiple decision trees, leading to more reliable 
sentiment classification. Moreover, it provides 
feature importance measures, enabling analysts to 
identify the most informative features driving 
sentiment. This information aids in understanding 
key sentiment factors and feature selection. With its 
ability to handle complex patterns, mitigate 
overfitting, and offer interpretable feature 
importance, RFCA enhances sentiment analysis and 
effectively extracts sentiment from textual data. 


3.0 SOPHISTICATED ARTIFICIAL BEE COLONY-INSPIRED DEEP Q-NETWORKS (SABC-DQN) 


3.1. Modified Deep Q-Networks 


Ensemble deep reinforcement learning 
(RL) combines multiple agents in an ensemble to 
enhance classification tasks. With diverse 
architectures, training procedures, and initialization 
settings, agents predict class labels for input data. 
Training agents independently promotes diversity 
and exploration of the feature space. Aggregating 
predictions or class probabilities from multiple 
agents enables comprehensive classification 
decisions. Voting, averaging, or weighted averaging 
are used to aggregate outputs. Ensemble deep RL 
improves generalization, mitigates overfitting, 
captures a wide range of patterns, and enhances 
robustness by reducing individual errors or biases. It 
provides insights into prediction uncertainty and 
considers the trade-off between performance 
improvement and computational complexity, as 
training and inference procedures for multiple agents 
require additional computational resources and 
memory. 
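
The three aggregation rules named above (voting, averaging, and weighted averaging) can be illustrated with a minimal sketch. The agent outputs, the class order, and the weights below are hypothetical values chosen for illustration; they are not taken from the experiments in this paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the largest number of agents."""
    return Counter(labels).most_common(1)[0][0]


def average_probs(prob_rows, weights=None):
    """Average per-agent class-probability vectors.

    With weights=None every agent counts equally (plain averaging);
    otherwise each agent's vector is scaled by its weight.
    """
    n_agents = len(prob_rows)
    if weights is None:
        weights = [1.0] * n_agents
    total = sum(weights)
    n_classes = len(prob_rows[0])
    return [
        sum(w * row[c] for w, row in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]


# Three hypothetical agents scoring one review over
# (negative, neutral, positive).
probs = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
labels = [max(range(3), key=row.__getitem__) for row in probs]

vote = majority_vote(labels)                              # label index
plain = average_probs(probs)                              # equal weights
weighted = average_probs(probs, weights=[1.0, 3.0, 1.0])  # trust agent 2 more
```

In this toy case voting and plain averaging both favor the positive class, while the weighted average favors the negative class because the second agent is given three times the weight of the others. This is the kind of behavior a weighting scheme is meant to control.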


Deep Q-Network (DQN) is an advanced 
classification algorithm that integrates deep neural 
networks with RL. DQN leverages deep neural 
networks’ power to handle high-dimensionality 
input data and effectively capture intricate patterns, 
facilitating accurate predictions in complex 
classification scenarios. The training process of a 
DQN-based classifier involves iterative interactions 
between the model and the environment. The model 
gradually improves performance by receiving 
observations and making decisions based on the 


current state. Evaluation is performed using a reward 
signal, which serves as a metric to assess the 
classifier’s effectiveness in the classification task. 
The model’s parameters are updated through a 
variant of the Q-learning algorithm known as deep 
Q-learning, which optimizes the model by 
minimizing the discrepancy between predicted and 
actual values derived from the reward signal. An 
advantageous characteristic of DQN is its ability to 
learn directly from raw input data, alleviating the 
need for extensive manual feature engineering. This 
capability proves particularly valuable when 
extracting meaningful features from the data is 
challenging or computationally demanding. 
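
As a rough illustration of the deep Q-learning update described above, the sketch below computes the standard bootstrapped target and the squared error for a single transition. The framing of sentiment classification as a one-step episode with a reward of +1 for a correct label and -1 otherwise is an assumption made for this example; the paper does not specify its reward design at this point.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a').

    When the episode ends (done=True) there is no next state,
    so the target is the reward alone.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(q_pred, target):
    """Per-transition loss that deep Q-learning minimizes."""
    return (target - q_pred) ** 2


# Hypothetical one-step episode: the action is the predicted class.
q_values = [0.1, 0.4, -0.2]   # assumed network outputs for (neg, neu, pos)
action, true_label = 1, 2     # the agent predicts "neutral"; truth is "positive"
reward = 1.0 if action == true_label else -1.0

target = td_target(reward, next_q_values=[], done=True)
loss = squared_td_error(q_values[action], target)
```

A full DQN would also use a replay buffer and a separate target network; those parts are omitted here to keep the sketch short.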


3.1.1. Context-Aware Weight Adjustment 


The primary agent may not exhibit overall 
effectiveness in DQN. However, it can still 
demonstrate exceptional performance in specific 
situations. Unfortunately, the existing static fusion 
strategy fails to exploit this valuable information 
fully. Active fusion is introduced to address this 
limitation by actively adjusting the weights assigned 
to base agents based on their competence within the 
local context of a test state. By evaluating a base 
agent’s performance in specific regions, it gains 
more influence over statewide policy decisions. As a 
result, the evaluation of a base agent becomes more 
nuanced and is not solely determined by its overall 
performance. While a base agent may display 
inadequate performance across all states, instances 
may exist where it excels in specific states. 
Unfortunately, the static fusion approach fails to 
leverage this valuable insight effectively. The 
proposed active fusion approach provides credit to 
an agent based on its performance within the local 
context of a specific test state. The evaluation of a 
base agent’s performance within the local region of 
a state carries greater weight when formulating 
statewide policy decisions. This approach enables a 
more comprehensive evaluation considering 
performance variations across different regions. 


The framework uses $z$ distinct base agents $\varphi_s$, 
all trained in the same environment, where $s$ ranges 
from 1 to $z$. By employing different training 
procedures, model topologies, and parameter settings, 
diverse agents can be created. Among the available 
options, the DQN is selected as the foundational agent 
for the framework due to its wide adoption and proven 
effectiveness. The state-to-action transition of agent 
$\varphi_s$ is given by the function $\varphi_s(e)$. To 
evaluate the performance of the agent $\varphi_s$ on a 
given state $e \in v$, denoted as $O(e, \varphi_s)$, the 
cumulative reward obtained over a $t$-step period is 
considered. This assessment can be conducted 
independently of the current state. Given a test state 
$e_f$ at time $f$, its similarity to the states $e \in v$ 
is measured with respect to $\varphi_s$ and is written 
$\mathrm{sim}(e, e_f, \varphi_s)$. The states $e \in v$ 
most similar to $e_f$ are selected to construct a similar 
set $S(e_f, \varphi_s)$. Subsequently, the local 
competence of $\varphi_s$ on $e_f$, denoted as 
$ZU(\varphi_s, e_f)$, is calculated based on 
$O(e, \varphi_s)$, where $e \in S(e_f, \varphi_s)$. The 
weight of $\varphi_s$, represented as $N_s$, is 
periodically updated, specifically every $v$ time 
intervals, according to $ZU(\varphi_s, e_f)$, with $f$ 
representing the update time. 


The ensemble technique begins with 
training the z base agents until convergence. The value 
of z can be determined based on the specific 
limitations and complexity of the application. 
Subsequently, an evaluation set v is created by 
randomly selecting states and observing the 
interactions of each base agent within its 
environment. Starting with a state in v, the 
effectiveness of the base agents is measured by 
evaluating the rewards obtained in the subsequent t 
stages. During testing, the weights of the base agents 
are regularly adjusted based on their local 
competence in managing states that are most similar 
to a particular test state within a radius of v. To make 
a final decision for a given test state, the choices 
made by all base agents are aggregated by a weighted 
average using these weights. 


Algorithm 1. Context-Aware Weight Adjustment 


Input: 
• Number of base agents ($z$) 
• Evaluation set size ($|v|$) 
• Number of stages ($t$) 
• State-to-action transition function for the agent $\varphi_s$ ($\varphi_s(e)$) 
• Similarity measure function between states $e$ and $e_f$ for agent $\varphi_s$ ($\mathrm{sim}(e, e_f, \varphi_s)$) 
• Performance assessment function for the agent $\varphi_s$ on state $e$ ($O(e, \varphi_s)$) 


Output: 
• Final decision for a given test state 


Procedure: 


E-ISSN: 1817-3195 

Step 1: Train the z base agents until they reach their optimal performance. 

Step 2: Create an evaluation set by randomly selecting a subset of states. 

Step 3: For each test state in the evaluation set: 
    a. Initialize an empty list of weights. 
    b. Iterate over all base agents: 
        • Using the similarity measure, calculate the similarity between the test state and each state observed during training. 
        • Add the similarity measure to the list of weights. 
    c. Normalize the weights so that their total sum equals 1. 

Step 4: For each test state in the evaluation set: 
    a. Initialize an empty list of choices. 
    b. Iterate over all base agents: 
        • Compute the decision made by each base agent for the test state using its state-to-action transition function. 
        • Multiply the decision by the corresponding weight calculated in Step 3 and add it to the list of choices. 
    c. Calculate the final decision for the test state by summing up all the choices. 

Step 5: Return the final decision for each test state in the evaluation set. 
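
Steps 3c and 4 of Algorithm 1 can be sketched as follows. The sketch assumes that each base agent returns a vector of action scores and that one non-negative raw score per agent is already available to be normalized into a weight. The helper names `normalize` and `fuse_decisions`, and all numbers, are hypothetical.

```python
def normalize(scores):
    """Scale non-negative scores so that they sum to 1 (Step 3c).

    Falls back to uniform weights if every score is zero.
    """
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def fuse_decisions(agent_outputs, raw_scores):
    """Weighted sum of per-agent output vectors (Step 4).

    Returns the fused score vector and the index of the chosen action.
    """
    weights = normalize(raw_scores)
    n_actions = len(agent_outputs[0])
    fused = [
        sum(w * out[a] for w, out in zip(weights, agent_outputs))
        for a in range(n_actions)
    ]
    decision = max(range(n_actions), key=fused.__getitem__)
    return fused, decision


# Three hypothetical base agents scoring two actions for one test state.
outputs = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
scores = [3.0, 0.5, 0.5]   # the first agent is judged most relevant here
fused, decision = fuse_decisions(outputs, scores)
```

With these values the first agent receives weight 0.75, so its preferred action is chosen even though the other two agents prefer the alternative.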


3.1.2. Assessment Set 

The approach involves constructing an 
evaluation set called $v$ to assess the expertise level 
of an agent in a specific area. Active fusion 
techniques in supervised and unsupervised learning 
also employ similar methods. Generating $v$ requires 
active interactions between the agent and the 
environment, ensuring that the examples in $v$ 
accurately reflect the distribution of states 
encountered in the application. The formation of $v$ 
relies on the capacities of each base agent. For each 
base agent $\varphi_s$, a subset of initial states is 
randomly selected across multiple episodes. The size of 
this subset is $|v|/z$, where $|v|$ represents the total 
size of the evaluation set, and $z$ corresponds to the 
total number of base agents. Consequently, $v$ 
encompasses all the starting points collectively 
chosen by the base agents. The size of $v$, denoted as 
$|v|$, holds significant importance. Following the Law 
of Large Numbers, a larger $|v|$ brings the 
distribution of states within $v$ closer to that of the 
natural state space. However, it is essential to note 
that increasing $|v|$ also increases the evaluation 
time complexity. This occurs because each state within 
$v$ needs to be compared with a test state for every 
base agent during the evaluation process. The 
experimental analysis in Section 4 provides detailed 
insights into the impact of $|v|$. 
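
The construction of the evaluation set can be sketched as below, under the assumption that each base agent contributes an equal share of |v|/z starting states. The callable `sample_start_state` is a hypothetical stand-in for running an agent in the environment and recording a state; it is not part of the paper.

```python
import random


def build_evaluation_set(agent_ids, sample_start_state, total_size, seed=0):
    """Collect total_size // len(agent_ids) states from every base agent.

    Each entry is (agent_id, state) so the origin of a state stays known.
    """
    rng = random.Random(seed)
    per_agent = total_size // len(agent_ids)   # |v| / z
    evaluation_set = []
    for agent_id in agent_ids:
        for _ in range(per_agent):
            state = sample_start_state(agent_id, rng)
            evaluation_set.append((agent_id, state))
    return evaluation_set


def toy_start_state(agent_id, rng):
    """Hypothetical sampler: a state is just a random integer ID."""
    return rng.randrange(1000)


v = build_evaluation_set(
    agent_ids=[0, 1, 2],
    sample_start_state=toy_start_state,
    total_size=12,
)
```

Since every test state is compared with every entry of the set for every agent, the cost of one evaluation grows linearly with both the set size and the number of agents.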


3.1.3. Performance Assessment 

In RL, where supervision is unavailable, 
evaluating an agent’s performance is an estimation 
process. When considering a state $e_f$, the 
performance of the agent $\varphi_s$ on that state is 
determined by the expected cumulative reward over 
the next $t$ steps, taking into account a discount 
factor $\gamma$. The performance is quantified using 
the following equation: 


$$O(e_f, \varphi_s) = \mathbb{E}\left(\sum_{j=0}^{t} \gamma^{j} \beta_{f+j}\right) \qquad (2)$$


where $\gamma$ represents the discount factor, and 
$\beta_{f+j}$ denotes the immediate reward obtained by 
$\varphi_s$ in state $e_{f+j}$. Due to the 
non-deterministic nature of the transition function in 
RL, the expected value is incorporated to account for 
the variability. The value of $t$ determines the number 
of future rewards associated with the state $e_f$. A 
small value of $t$ is generally not preferable, since 
the observations made from $e_f$ may not provide 
sufficient information for accurate estimation, which 
leads to similar values of $O$ for most states. 
Conversely, when $t$ is large, the cumulative reward 
may not have a close relationship with the present 
decision, as it considers rewards from distant future 
steps. 
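
The performance measure O can be estimated by averaging discounted returns over sampled rollouts, as in the minimal sketch below. The reward sequences and the discount factor are hypothetical, and the Monte Carlo average is one common way to approximate an expectation rather than a procedure stated in the paper.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**j * reward_j over one rollout."""
    total = 0.0
    for j, reward in enumerate(rewards):
        total += (gamma ** j) * reward
    return total


def estimate_performance(rollouts, gamma):
    """Monte Carlo estimate of O: mean discounted return across rollouts
    that all start from the same state."""
    returns = [discounted_return(r, gamma) for r in rollouts]
    return sum(returns) / len(returns)


# Two hypothetical rollouts of three rewards each from the same state.
rollouts = [[1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
score = estimate_performance(rollouts, gamma=0.5)
```

Here the two returns are 1.25 and 1.75, giving an estimate of 1.5. Lengthening the rollouts adds terms that are discounted more heavily, which reflects the trade-off in choosing t discussed above.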


3.1.4. State of Resemblance 

The approach introduced in this study 
incorporates active fusion, which enables 
performance enhancement even in scenarios where 
the primary agent’s overall effectiveness is limited. 
Active fusion involves assigning weights to base 
agents based on their competence within the local 
context of a test state. Their expertise can be 
effectively leveraged by considering the 
performance of base agents in specific regions or 
states. Unlike static fusion approaches, which fail to 
fully utilize the potential of base agents that may 


excel in specific states but perform poorly overall, 
active fusion overcomes this limitation. It evaluates 
base agents based on their performance within the 
local context of a test state, giving more weight to 
agents that demonstrate high performance in that 
particular region. To implement active fusion, an 
evaluation set called v is created. This set comprises 
carefully selected states that reflect the allocation of 
states used in the application. Each base agent 
contributes to the formation of v based on its 
capacities. For each base agent $\varphi_s$, a subset of initial 
states is randomly selected from v across multiple 
episodes. The size of this subset is determined by 
dividing the total size of v, |v|, by the total number of 
base agents (z), ensuring a fair representation of all 
possible starting points collectively chosen by the 
base agents. 


The size of $v$, denoted as $|v|$, holds 
significance in this approach. Increasing $|v|$ brings 
the distribution of states within $v$ closer to that of 
the natural state space, in line with the Law of Large 
Numbers. However, it is important to note that a larger 
$|v|$ also increases the evaluation time complexity, 
because every state within $v$ must be compared with a 
test state for every base agent during the evaluation 
process. Equation (3) is employed to measure the 
similarity between two states, $e_1$ and $e_2$, using a 
base agent $\varphi_s$: 


$$\mathrm{sim}(e_1, e_2, \varphi_s) = \mathrm{dist}(\tau_s(e_1), \tau_s(e_2)) \qquad (3)$$


Here, $\tau_s(e_1)$ represents the output of the final 
convolutional layer applied by the agent $\varphi_s$ to 
state $e_1$, and $\tau_s(e_2)$ represents the 
corresponding output for the state $e_2$. The function 
dist() calculates the distance between the latent 
representations of the states, using the 2-norm 
(Euclidean distance). Because sim is defined as a 
distance, a smaller value indicates more similar 
states. 
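
Equation (3) can be illustrated with a short sketch in which the agent's final-layer output is replaced by a toy encoder. The function `toy_encoder` and its two features are purely hypothetical; in the described approach the latent vector would come from the agent's own network.

```python
import math


def dist(u, v):
    """Euclidean (2-norm) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sim(e1, e2, encode):
    """Equation (3): distance between the latent codes of two states.

    Smaller values mean the agent represents the two states more alike.
    """
    return dist(encode(e1), encode(e2))


def toy_encoder(state):
    """Hypothetical stand-in for the agent's final-layer output:
    (text length, number of exclamation marks)."""
    return [float(len(state)), float(state.count("!"))]


d_close = sim("great course!", "great course!!", toy_encoder)
d_far = sim("great course!", "ok", toy_encoder)
```

Each agent has its own encoder, so the same pair of states may be close for one agent and far apart for another. This is why the measure takes the agent as an argument.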


3.1.5. Local Competency 

The local competence, denoted as 
$ZU(\varphi_s, e_f)$, plays a crucial role in assessing 
the proficiency of the agent $\varphi_s$ in the 
vicinity of a given state $e_f$. This measure provides 
insights into how well $\varphi_s$ performs in the 
local context of $e_f$. Two key components are employed 
to determine the local competence: $O$, as defined in 
equation (2), and sim, as defined in equation (3), both 
evaluated on the validation set $v$. When encountering 
an unseen state $e_f$, a careful selection process 
takes place from the validation set $v$ to identify the 
nearest neighbors of $e_f$. These neighbors are chosen 
according to equation (3) for agent $\varphi_s$, and 
they form a comparable state set denoted as 
$S(\varphi_s)$. However, to ensure a fair evaluation, 
the final comparable state set, denoted as $S$, is 
constructed by combining $S(\varphi_s)$ from all 
participating base agents. It is important to note that 
the size of $S$ must be greater than or equal to $a$ to 
ensure adequate representation. The size of $S$ is 
directly proportional to the value of $a$, which 
indicates the number of neighbors considered. 


It is worth mentioning that the size of $S$ is a 
critical factor. If $S$ is too small, it may not provide 
enough data to assess the local competence of $\varphi_s$ 
for $e_f$ reliably. On the other hand, a large value 
could introduce a significant disparity between $e_f$ 
and the states within $S$, shifting the focus away from 
the local region. The mean value of $O$ for the agent 
$\varphi_s$ across all states within $S$ is calculated to 
quantify the local competence. This can be 
expressed as Eq.(4). 


$$ZU(\varphi_s, e_f) = \mu_{e \in S}\big(O(e, \varphi_s)\big) \qquad (4)$$


Because the states within $S$ are closely related 
to $e_f$, an agent $\varphi_s$ that performs well on this set 
can be expected to perform well in the local context 
of $e_f$. Such an agent therefore obtains a high local 
competence $ZU(\varphi_s, e_f)$ for the state $e_f$. 

Algorithm 2: Local Competence Evaluation for Base Agents 
Input: 
• State $e_f$: The unseen state to be evaluated. 
• Base agents' performances $O(e, \varphi_s)$: Performance 
  values of base agents on different states. 
• Validation set $v$: A set of states used for 
  evaluating local competence. 

Output: 
• Local competence $ZU(\varphi_s, e_f)$: The measure of the 
  base agent's proficiency in the local context of $e_f$. 

Procedure: 
Step 1: Initialize the comparable state set $S$ as an 
        empty set. 
Step 2: For each base agent $\varphi_s$: 
        • Select the nearest neighbors of $e_f$ from the 
          validation set $v$ based on similarity 
          measurement. 
        • Add these neighbors to $S(\varphi_s)$. 
Step 3: Combine $S(\varphi_s)$ from all base agents to form 
        the final comparable state set $S$. 
Step 4: If $|S| < a$, increase the size of $S$ by adding 
        more states from $v$ until $|S| = a$. 
Step 5: Calculate the local competence $ZU(\varphi_s, e_f)$ as 
        the mean value of $O(e, \varphi_s)$ for all states $e$ 
        in $S$. 
Step 6: Return the local competence $ZU(\varphi_s, e_f)$ as the 
        output. 
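
The local competence evaluation can be sketched as follows. This is a minimal illustration, assuming that the latent vectors of all evaluation states and the performance values $O(e, \varphi_s)$ have been precomputed; the names `eval_latents`, `test_latents`, and `performance` are hypothetical and introduced only for this example.

```python
import math


def euclidean(a, b):
    """2-norm distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(test_latent, eval_latents, a):
    """Indices of the `a` evaluation states closest to the test state."""
    order = sorted(range(len(eval_latents)),
                   key=lambda i: euclidean(test_latent, eval_latents[i]))
    return order[:a]


def local_competence(agent_idx, test_latents, eval_latents, performance, a):
    """Mean performance of one agent over the combined neighbor set S.

    test_latents[s]    : latent vector of the test state under agent s
    eval_latents[s][i] : latent vector of evaluation state i under agent s
    performance[s][i]  : O(e_i, agent s), assumed to be precomputed
    """
    comparable = set()
    for s in range(len(eval_latents)):
        # Steps 2-3: build S(phi_s) for every agent and take the union.
        comparable.update(nearest_neighbors(test_latents[s], eval_latents[s], a))
    # Step 4 (|S| >= a) holds automatically here, because every agent
    # contributes `a` states when the evaluation set has at least `a` states.
    scores = [performance[agent_idx][i] for i in sorted(comparable)]
    return sum(scores) / len(scores)  # Step 5: mean of O over S
```

Taking the union over all agents means that every agent is scored on the same set of states, which is what makes the comparison between agents fair.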


3.1.6. Active Weight Distribution 

The weight assigned to agent $\varphi_s$ for state $e_f$, 
denoted as $n(\varphi_s, e_f)$, is derived from the local 
competence $ZU(\varphi_s, e_f)$. It is computed by dividing 
$ZU(\varphi_s, e_f)$ by the sum of the local competences of 
all base agents, indexed from 1 to $z$. The expression 
for this calculation is given in Eq.(5). 


$$n(\varphi_s, e_f) = \frac{ZU(\varphi_s, e_f)}{\sum_{k=1}^{z} ZU(\varphi_k, e_f)} \qquad (5)$$


In RL, consecutive states are strongly 
interdependent, and abrupt changes in the weights of 
the underlying agents can significantly impact the 
learning process. Consequently, when updating the 
weight $N_s$ of $\varphi_s$, it is crucial to combine the 
previous weight $N_s$ with the newly derived value 
$n(\varphi_s, e_f)$. The adjustment can be formulated as 
Eq.(6). 


$$N_s = \begin{cases} n(\varphi_s, e_f), & f = 0 \\ \delta\, n(\varphi_s, e_f) + (1 - \delta)\, N_s, & f > 0 \end{cases} \qquad (6)$$


The update parameter $\delta$ controls the influence of 
$n(\varphi_s, e_f)$ on the weight update process. 


Another aspect is how often the weights of 
the base agents are updated. The update interval is 
denoted as $v$ and takes positive integer values, 
$v \in I^{+}$. Striking the right balance for $v$ is critical, 
as it directly affects the frequency of weight 
updates and the time complexity. When $v$ is small, 
the weights are updated frequently, which can affect 
the stability of the model's performance. Conversely, 
if $v$ is set too high, the weight values may not 
adequately reflect the competence of the base agents 
under current conditions. Selecting an appropriate 
value for $v$ therefore warrants careful consideration 
in our model. To derive the final value $X(e_f, d_f)$ 
used for decision-making, Eq.(7) is utilized. 


$$X(e_f, d_f) = \sum_{s=1}^{z} N_s \cdot \mathrm{Softmax}\big(X_s(e_f, d_f)\big) \qquad (7)$$


To mitigate the impact of the range of 
$X_s$ on the final decision, the Softmax function is 
applied to the individual $X_s$ values before fusing 
them. The algorithm then selects the action $d_f$ with 
the highest value of $X(e_f, d_f)$, facilitating 
effective decision-making. 
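
The weighting and fusion steps of Eq.(5) to Eq.(7) can be illustrated with the following sketch. It is not the reference implementation: the function names are hypothetical, the fallback to uniform weights when all competences are zero is an assumption, and the per-agent action values $X_s$ are passed in as plain lists.

```python
import math


def normalize_competence(competences):
    """Eq. (5): divide each local competence by the sum over all agents."""
    total = sum(competences)
    if total == 0:
        # Assumption: fall back to uniform weights if no agent is competent.
        return [1.0 / len(competences)] * len(competences)
    return [c / total for c in competences]


def smooth_weights(previous, current, delta, step):
    """Eq. (6): take the new weights at step 0, blend with the old ones after."""
    if step == 0:
        return list(current)
    return [delta * n + (1.0 - delta) * p for n, p in zip(current, previous)]


def softmax(values):
    """Softmax with the maximum subtracted for numerical stability."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_action_values(weights, values_per_agent):
    """Eq. (7): weighted sum of the softmax-normalized values of each agent."""
    fused = [0.0] * len(values_per_agent[0])
    for weight, values in zip(weights, values_per_agent):
        for action, prob in enumerate(softmax(values)):
            fused[action] += weight * prob
    return fused


def select_action(weights, values_per_agent):
    """Pick the action with the highest fused value."""
    fused = fuse_action_values(weights, values_per_agent)
    return max(range(len(fused)), key=fused.__getitem__)
```

Because each softmax output sums to one and the weights sum to one, the fused values also sum to one, so agents with very different value ranges contribute on a comparable scale.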


3.1.7. Time Complexity 

Active fusion methods, known for their 
flexibility and adaptability, often require additional 
computational time compared to static fusion 
techniques. Our method employs an active fusion 
approach by determining the closest neighbor subset, 
denoted as $S(\varphi_s)$, within the $z$ feature spaces. To 
achieve this, we calculate active weights based on 
the distance between the current state $e_f$ and all 
other states in the evaluation set $v$. Computing the 
active weights involves measuring the dissimilarity 
or similarity between the present state $e_f$ and each 
state in the evaluation set. By quantifying the 
distance metric, we can assign weights that reflect 
the relevance or importance of each neighboring 
state concerning the current state. This active 
weighting scheme enables us to incorporate the 
varying degrees of similarity between states into the 
fusion process. 


It is essential to note that this active fusion 
method does introduce increased time complexity. 
The time complexity is proportional to $K$, the 
number of features in each state, multiplied by $az$, 
the number of neighboring states, and by the size of 
the evaluation set, $|v|$. Consequently, the 
computational requirements may be higher than for 
static fusion methods, mainly when dealing with 
larger feature spaces or evaluation sets. To mitigate 
the impact of the increased time complexity, the 
computation of the distance metric can be 
parallelized. Parallelization distributes the workload 
across multiple processing units, such as CPUs or 
GPUs, so that the distances are computed 
simultaneously. Leveraging parallel processing can 
significantly reduce the overall calculation time, 
particularly in scenarios with strict real-time 
requirements. While active fusion methods may 
demand more computational time, the benefits of 
adaptability and parallelization can make them 
suitable for applications that prioritize real-time 
processing and require flexible fusion approaches. 


Algorithm 3: Active Fusion with Timing 
Considerations 
Input: 
• $e_f$: The current state of the system 
• $v$: The evaluation set of neighboring states 
• $z$: The number of feature spaces 
• $K$: The number of features in each state 

Output: 
• $S(\varphi_s)$: The closest neighbor subset within the 
  $z$ feature spaces 

Procedure: 
Step 1: Initialize an empty set $S(\varphi_s)$ to store the 
        closest neighbor subset. 
Step 2: For each state $e$ in the evaluation set $v$, do 
        the following steps: 
        • Calculate the distance metric between $e_f$ 
          and $e$ based on the feature spaces. 
        • Assign an active weight to $e$ based on its 
          distance from $e_f$. 
Step 3: Sort the states in $v$ based on their active 
        weights in ascending order. 
Step 4: Select the top $K$ states with the lowest active 
        weights and add them to $S(\varphi_s)$. 
Step 5: Return $S(\varphi_s)$ as the closest neighbor subset 
        within the $z$ feature spaces. 
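
The neighbor selection of Algorithm 3 can be sketched as below. As an assumption made only for this example, the active weight of a state is taken to be its Euclidean distance to the current state, and `heapq.nsmallest` replaces the full sort of Step 3, which returns the same states without ordering the whole evaluation set.

```python
import heapq
import math


def closest_neighbors(current_state, evaluation_set, k):
    """Return the k evaluation states nearest to current_state."""

    def distance(state):
        # Active weight (assumption): plain Euclidean distance.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(current_state, state)))

    # nsmallest keeps only the k best candidates while scanning the set.
    return heapq.nsmallest(k, evaluation_set, key=distance)
```

Since the distance of each evaluation state is computed independently, this loop is also the part of the method that could be distributed across processing units.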


Algorithm 4: Modified DQN 
Input: 
• Number of base agents ($z$) 
• Evaluation set size ($|v|$) 
• Number of stages ($t$) 
• State-to-action transition function for agent 
  $\varphi_s$: $\varphi_s(e)$ 
• Similarity measure function between states $e$ and 
  $e_f$ for agent $\varphi_s$: $\mathrm{sim}(e, e_f, \varphi_s)$ 
• Performance assessment function for agent $\varphi_s$ 
  on state $e$: $O(e, \varphi_s)$ 

Output: 
• Final decision for a given test state 

Procedure: 
Step 1: Train the $z$ base agents until they reach 
        their optimal performance. 
Step 2: Create an evaluation set by randomly 
        selecting a subset of available states. 
Step 3: For each test state $e_f$ in the evaluation set: 
        a. Initialize an empty list of weights. 
        b. Iterate over all base agents $\varphi_s$: 
           • Calculate the similarity between the 
             test state $e_f$ and each state observed 
             during training using the similarity 
             measure function $\mathrm{sim}(e, e_f, \varphi_s)$. 
           • Add the similarity measure to the list 
             of weights. 
        c. Normalize the weights so that their total 
           sum equals 1. 
Step 4: For each test state $e_f$ in the evaluation set: 
        a. Initialize an empty list of choices. 
        b. Iterate over all base agents $\varphi_s$ as 
           described in Step 5. 
Step 5: Compute the decision made by each base agent 
        for the test state $e_f$ using its state-to-action 
        transition function $\varphi_s(e)$. 
        a. Multiply the decision by the corresponding 
           weight calculated in Step 3 and add it to 
           the list of choices. 
        b. Calculate the final decision for the test 
           state $e_f$ by summing up all the choices. 
Step 6: Return the final decision for each test state 
        in the evaluation set. 


3.2. Sophisticated Artificial Bee Colony 

The Sophisticated Artificial Bee Colony 
(SABC) algorithm is based on the foraging 
behavior of honey bees. A hive contains three 
types of bees: employed bees, onlookers, and scouts. 
Each bee is associated with a food source, which 
corresponds to a potential solution to the problem 
being addressed. These bees collaborate to 
discover the best food sources for the colony. The 
most important food sources are recorded in an 
external archive, which is regularly updated using 
local search procedures and the artificial bees. The 
archive contains the potential food sources that are 
advertised to the onlookers through dance, and the 
employed bees recall this information. This is where 
SABC differs from the standard ABC (Artificial Bee 
Colony) algorithm. The onlookers gather around the 
dance area to choose a food source based on its 
quality. The SABC algorithm uses the same number 
of employed bees and onlookers as the traditional 
ABC algorithm. 


If a food source cannot be improved within 
a predefined number of trials, known as the "limit," 
a scout bee will randomly search for a new food 
source. Once a food source is added to the archive, 
any other candidate in the archive that it dominates 
is removed. When the maximum size of the archive 
is exceeded, the non-dominated candidates in the 
archive are truncated using a hypervolume indicator. 
However, the performance of this approach 
deteriorates with frequent usage. If solution A 
represents the maximum limit of non-dominated 
solutions (i.e., the desired output), the archive size is 
set to 2D. The truncation process removes solution 
A if the number of candidates present is denoted as 
P,, exceeds p,. To ensure a constant replenishment 
of food supply, the algorithm employs the following 
greedy selection ideas: if the bees ignore a new food 
source, they will continue feeding on their current 
one; otherwise, the trial count for the new food 
source is increased by one. 


3.2.1. Initialization 

During the initialization process of the 
algorithm, a set of food sources is randomly assigned 
to the bees in the colony. This set is denoted as 
$P_1, P_2, \ldots, P_N$, where each $P_i$ corresponds to a 
specific food source. These food sources serve as 
potential solutions to the problem at hand. 


The concept of dominance determines 
which food sources should be preserved for future 
use. A food source $P_i$ is said to dominate another 
food source $P_j$ if it performs better or achieves a 
higher objective value in all the problem dimensions. 
In other words, $P_i$ is considered superior to $P_j$ 
regarding its solution quality. Among the assigned 
food sources, only those not dominated by any other 
source are selected for preservation. These 
non-dominated food sources are stored in an archive, 
which acts as a repository of valuable solutions that 
can guide the algorithm towards better outcomes. 
The size of this archive is denoted by $C$, which is 
set to half the colony's size. The initialization 
procedure, Algorithm 5, outlines the steps involved 
in this process. It encompasses the random 
assignment of food sources to the bees, evaluating 
their dominance 




relationships, and selecting non-dominated sources 
for inclusion in the archive. By carefully preserving 
non-dominated food sources during initialization, 
the algorithm ensures that promising solutions are 
retained and available for future exploration. This 
initialization phase sets the foundation for the 
subsequent iterations of the algorithm, where bees 
will further explore and refine these initial solutions 
in search of better outcomes. 


Algorithm 5: Initialization 

Step 1: Initialize an empty archive, denoted as 
        "Archive." 
Step 2: For each bee in the colony (i.e., for 
        i = 1 to N): 
        • Generate a random food source, $P_i$, 
          associated with the bee. 
Step 3: For each food source in the generated set 
        (i.e., for i = 1 to N): 
        • Set a dominance flag, "isDominated," to 
          false. 
        • Compare the current food source, $P_i$, with 
          all other food sources in the generated set 
          (excluding itself). 
Step 4: For each food source $P_j$ (where $j \neq i$) in the 
        generated set: 
        • If $P_j$ dominates $P_i$ (i.e., $P_j$ is superior in 
          all problem dimensions), set isDominated to 
          true. 
        • If isDominated is true, break the comparison 
          loop. 
        • If isDominated is still false after all 
          comparisons, add $P_i$ to the archive. 
Step 5: If the archive size exceeds $C$, perform 
        truncation to reduce its size to $C$ (e.g., by 
        removing the least promising or oldest 
        solutions). 
Step 6: Return the archive containing the preserved 
        non-dominated food sources. 
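
A compact sketch of the initialization procedure is given below. It follows the dominance definition stated above (strictly better in every objective) and assumes maximization. The `objectives` callable and the simple "keep the first entries" truncation are assumptions made for illustration; the text describes a hypervolume-based truncation that is not reproduced here.

```python
import random


def dominates(a, b):
    """True if a is strictly better than b in every objective (maximization)."""
    return all(x > y for x, y in zip(a, b))


def initialize(colony_size, dimension, objectives, archive_size, rng):
    """Generate random food sources and archive the non-dominated ones.

    objectives: hypothetical callable mapping a food source to a tuple of
    objective values that should be maximized.
    """
    sources = [[rng.random() for _ in range(dimension)]
               for _ in range(colony_size)]
    scores = [objectives(p) for p in sources]
    archive = [
        sources[i]
        for i in range(colony_size)
        if not any(dominates(scores[j], scores[i])
                   for j in range(colony_size) if j != i)
    ]
    # Assumption: truncate by keeping the first entries only.
    return sources, archive[:archive_size]
```

For example, `initialize(6, 2, lambda p: (p[0], p[1]), 3, random.Random(0))` returns six random two-dimensional sources together with at most three non-dominated ones.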


3.2.2. Sending of Employed Bees 

When a new food source is generated for 
an employed bee, as indicated by Eq.(8), the 
selection of that source depends on several factors. 
These factors include the currently remembered food 
source, $p_s$, and three distinct food sources recalled 
by neighboring bees, $p_g$, $p_j$, and $p_l$. In the 
SABC algorithm, neighbors refer to all other 
employed bees present, excluding the bee itself. The 
mathematical representation of this process is given 
by Eq.(8). 

$$p'_{s,w} = \begin{cases} p_{s,w}, & \text{if } b_1 < 0.5 \\ p_{g,w} + b_2\,(p_{j,w} - p_{l,w}), & \text{otherwise} \end{cases} \qquad (8)$$


In Eq.(8), two random variables, $b_1$ and $b_2$, 
are utilized. These variables are selected from the 
range [0,1] and play a significant role in the 
calculation. They introduce randomness and 
variability to the bee-sending process, enabling 
diversity in the search for optimal food sources. 


The specifics of the bee-sending process, 
including the update of food sources based on 
Eq.(8), are outlined in Algorithm 6. Additionally, 
the algorithm considers the trial number of the food 
source, denoted as $F(p)$, which represents the 
number of times the particular food source has been 
explored and evaluated. By incorporating these 
calculations and considering the current food source, 
neighboring food sources, random variables, and 
trial number, the algorithm guides the bees in 
determining the updated food source for each 
employed bee. This process contributes to exploring 
and potentially improving food sources within the 
SABC algorithm. 
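
The update rule of Eq.(8) can be sketched as follows. This is an illustrative reading rather than the definitive implementation: it assumes that $b_1$ and $b_2$ are drawn independently for every dimension $w$, and the list-based representation of food sources and trial counters is hypothetical.

```python
import random


def employed_bee_update(sources, s, trials, rng):
    """Return a candidate for food source s following Eq. (8).

    sources: list of food sources (lists of floats), at least 4 entries
    trials:  list of trial numbers F(p), updated in place
    """
    others = [i for i in range(len(sources)) if i != s]
    g, j, l = rng.sample(others, 3)  # three distinct neighboring sources
    candidate = []
    for w in range(len(sources[s])):
        b1, b2 = rng.random(), rng.random()
        if b1 < 0.5:
            candidate.append(sources[s][w])  # keep the current component
        else:
            candidate.append(sources[g][w]
                             + b2 * (sources[j][w] - sources[l][w]))
    trials[s] += 1  # the trial number of the source is incremented
    return candidate
```

When all neighbors hold the same position, the difference term vanishes and the candidate equals the current source, so the step size of the search shrinks naturally as the colony converges.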


Algorithm 6: Employed Bee 

Step 1: Set the updated food source, denoted as 
        $p'_{s,w}$, to be the same as the current food 
        source, $p_{s,w}$. 
Step 2: Generate two random numbers: $b_1 \in [0,1]$ 
        and $b_2 \in [0,1]$. 
Step 3: If $b_1 < 0.5$, then: set $p'_{s,w}$ as $p_{s,w}$ 
        (i.e., no change in the food source). 
Step 4: If $b_1 \geq 0.5$, then: 
        • Calculate the difference between the 
          recalled additional food sources: set 
          diff = $p_{j,w} - p_{l,w}$. 
        • Calculate the updated food source $p'_{s,w}$ 
          based on Eq.(8): set 
          $p'_{s,w} = p_{g,w} + b_2 *$ diff. 
Step 5: Update the trial number of the food source, 
        $F(p)$, by incrementing it by 1 
        ($F(p) = F(p) + 1$). 
Step 6: Return the updated food source, $p'_{s,w}$, along 
        with its updated trial number, $F(p)$. 


3.2.3. Forming of New Food Sources 

To generate new food sources, bees must 
select potential options from the existing archive, 
which consists of food sources previously identified 
by other bees. However, for safety reasons, any food 
source in the archive whose trial number exceeds the 
defined "limit" is not considered for selection. 
The food sources in the archive can be categorized 
into different groups based on their trial numbers, as 
described in Eq.(9): 

$$\begin{aligned} E_0 &= \{p \in \text{archive} \mid F(p) = 0\} \\ E_1 &= \{p \in \text{archive} \mid F(p) = 1\} \\ &\;\;\vdots \\ E_a &= \{p \in \text{archive} \mid F(p) = a\} \end{aligned} \qquad (9)$$

where $a$ is a value less than the limit. These groups 
represent the subsets of the archive containing food 
sources with specific trial numbers. The notation 
$F(p)$ represents the trial number of a particular food 
source $p$. 


Let $D$ denote the set of food sources in the 
archive with trial numbers below the limit. If $|D|$ 
is less than or equal to the desired archive size $C$, 
all food sources in $D$ are selected, and the 
remaining $C - |D|$ food sources are taken from the 
memories of the employed bees, giving priority to 
those with the fewest trial numbers. These memories 
contain information about previously visited food 
sources. On the other hand, if $|D|$ exceeds $C$, the 
selection keeps the $C$ food sources in $D$ with the 
minimum trial numbers. Algorithm 7 provides a 
detailed description of this selection procedure, 
taking into account the considerations mentioned 
above. 


Algorithm 7: New Food Sources 

Step 1: Initialize an empty set, $D$, to store food 
        sources with trial numbers less than the 
        limit. 
Step 2: For each food source $p$ in the archive: 
        a) If $F(p) <$ limit, add $p$ to $D$. 
Step 3: If $|D| \leq C$: 
        a) Add the food sources in $D$ to the 
           selection. Then select the $C - |D|$ food 
           sources with the fewest trial numbers from 
           the employed bees' memories and add them 
           to the selection. 
        b) Return the selected food sources for the 
           next iteration. 
Step 4: If $|D| > C$: 
        a) Sort the food sources in $D$ based on their 
           trial numbers in ascending order. 
        b) Select the $C$ food sources with the minimum 
           trial numbers from $D$ and add them to the 
           selection. 
        c) Return the selected food sources for the 
           next iteration. 
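
The selection rule of Algorithm 7 can be sketched as follows. Food sources are represented by hashable identifiers and `trials` maps each identifier to its trial number $F(p)$; these names, and the decision to skip memory entries that are already selected, are assumptions introduced for this example.

```python
def form_new_food_sources(archive, memories, trials, limit, archive_size):
    """Select archive_size food sources, preferring low trial numbers.

    archive, memories: lists of hashable food-source identifiers
    trials:            dict mapping identifier -> trial number F(p)
    """
    # Steps 1-2: keep archive members whose trial number is below the limit.
    eligible = sorted((p for p in archive if trials[p] < limit),
                      key=lambda p: trials[p])
    if len(eligible) >= archive_size:
        # Step 4: enough candidates, keep those with the fewest trials.
        return eligible[:archive_size]
    # Step 3: fill the remaining slots from the employed bees' memories.
    chosen = set(eligible)
    extras = sorted((p for p in memories if p not in chosen),
                    key=lambda p: trials[p])
    return eligible + extras[:archive_size - len(eligible)]
```

Sorting by trial number in both branches keeps the rule uniform: sources that have been exploited the least are always preferred.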


3.2.4. Evaluation of Food Sources 
The SABC algorithm uses a novel approach 
to evaluate potential food sources based on factors 
such as domination strength and crowd distance. 
This approach divides the set $X$ of possible food 
sources into multiple disjoint subsets, as depicted in 
Eq.(10), where $v \nprec d$ indicates that $v$ does not 
dominate $d$: 

$$\begin{aligned} E_1 &= \{d \in X \mid \forall v \in X : v \nprec d\} \\ E_2 &= \{d \in X - E_1 \mid \forall v \in X - E_1 : v \nprec d\} \\ &\;\;\vdots \\ E_s &= \Big\{d \in X - \bigcup_{w=1}^{s-1} E_w \;\Big|\; \forall v \in X - \bigcup_{w=1}^{s-1} E_w : v \nprec d\Big\} \end{aligned} \qquad (10)$$

Each subset $E_z$, where $z \in [1, s]$, contains 
individual food sources. The domination strength 
and crowd distance for each food source $p \in E_z$ 
are calculated using Eq.(11) and Eq.(12): 

$$\mathrm{strength}(p) = \Big(1 - \frac{z}{s}\Big) + \frac{1}{s} \qquad (11)$$

$$\mathrm{distance}(p) = \frac{\sum_{\kappa \in E_z} |G'(p) - G'(\kappa)|}{s \cdot c \cdot (|E_z| - 1)} \qquad (12)$$

where $G'(p)$ represents the objective function with 
normalized values for $p$ and $c$ is a constant. The 
crowd distance of a food source falls within the 
range of 0 and $1/s$, indicating the extent to which a 
particular food source has been exploited. The food 
source attraction, denoted as $M(p)$, is 
mathematically defined in Eq.(13): 

$$M(p) = \frac{\mathrm{strength}(p) + \mathrm{distance}(p)}{\sum_{p_k \in X} \big(\mathrm{strength}(p_k) + \mathrm{distance}(p_k)\big)} \qquad (13)$$

As a result, non-dominated food sources 
receive a higher attraction than dominated ones. 
Additionally, among two non-dominated food 
sources, the one with a sparser distribution of 
neighboring food sources will be more desirable. 
These calculations and evaluations enable the SABC 
algorithm to effectively assess and prioritize food 
sources based on their domination strength, crowd 
distance, and overall attraction. 

Algorithm 8: Food Sources Evaluation 


Step 1: Divide the set $X$ of possible food sources 
        into $s$ disjoint subsets: 
        a) Initialize an empty list $E_1$. 
        b) For each food source $d$ in $X$: 
           • Check whether any other food source $v$ 
             in $X$ dominates $d$. 
           • If no such food source is found, add $d$ 
             to $E_1$. 
        c) Add $E_1$ to the list of subsets $E$. 
        d) For each subset $E_z$ in $E$, where $z$ ranges 
           from 2 to $s$: 
           • Initialize an empty list $E_z$. 
           • For each food source $d$ in 
             $X - \bigcup_{w=1}^{z-1} E_w$: 
             ✓ Check whether any other food source 
               $v$ in $X - \bigcup_{w=1}^{z-1} E_w$ 
               dominates $d$. 
             ✓ If no such food source is found, add 
               $d$ to $E_z$. 
           • Add $E_z$ to the list of subsets $E$. 
Step 2: Calculate the strength of domination and 
        crowd distance for each food source $p$ in the 
        subsets $E_z$, where $z$ ranges from 1 to $s$. 
Step 3: For each $p$ in $E_z$: 
        a) Calculate the strength of domination: 
           • Calculate the dominance factor 
             $(1 - z/s)$ and add $1/s$. 
           • Assign the result as the strength of $p$. 
        b) Calculate the crowd distance: 
           • For each food source $\kappa$ in $E_z$, 
             calculate the absolute difference 
             between the objective function values 
             of $p$ and $\kappa$, denoted as $G'(p)$ and 
             $G'(\kappa)$, respectively. 
           • Accumulate the differences. 
           • Divide the accumulated differences by 
             $(s * c(|E_z| - 1))$, where $c$ is a 
             constant. 
           • Assign the result as the crowd distance 
             of $p$. 
Step 4: Calculate the food source attraction $M(p)$ 
        for each food source $p$ in $X$. 
Step 5: For each $p$ in $X$: 
        • Calculate the total strength and distance 
          in $X$ as the sum of strength($p_k$) and 
          distance($p_k$) over all food sources $p_k$. 
        • Calculate the attraction $M(p)$ as the ratio 
          of the sum of strength and distance of $p$ 
          to the total sum. 
Step 6: Select the food sources with the highest 
        attraction values as the selected food 
        sources for the next iteration. The number of 
        selected food sources should match the 
        desired archive size $C$. 
Step 7: Return the selected food sources. 
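
The evaluation in Algorithm 8 can be sketched as follows. The example makes several assumptions that are not stated in the text: objective vectors are already normalized to [0, 1] and are maximized, the absolute difference $|G'(p) - G'(\kappa)|$ is summed over all objectives, a subset with a single member is given a crowd distance of zero, and the constant $c$ is passed in as a parameter.

```python
def dominates(a, b):
    """True if a is strictly better than b in every objective."""
    return all(x > y for x, y in zip(a, b))


def split_into_fronts(scores):
    """Eq. (10): peel off successive non-dominated subsets E_1, ..., E_s."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = [d for d in remaining
                 if not any(dominates(scores[v], scores[d])
                            for v in remaining if v != d)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts


def attraction(scores, c=1.0):
    """Eqs. (11)-(13): normalized strength plus crowd distance per source."""
    fronts = split_into_fronts(scores)
    s = len(fronts)
    raw = [0.0] * len(scores)
    for z, front in enumerate(fronts, start=1):
        strength = (1.0 - z / s) + 1.0 / s          # Eq. (11)
        for p in front:
            if len(front) > 1:
                spread = sum(abs(gp - gk)
                             for k in front
                             for gp, gk in zip(scores[p], scores[k]))
                distance = spread / (s * c * (len(front) - 1))  # Eq. (12)
            else:
                distance = 0.0  # assumption: lone member, no crowd distance
            raw[p] = strength + distance
    total = sum(raw)
    return [r / total for r in raw]                 # Eq. (13)
```

With this formulation the attraction values sum to one, so they can be used directly as selection probabilities for the onlooker bees.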


3.2.5. Sending of Onlooker Bees 

When an onlooker bee is attracted to a 
particular food source $p_s$, it proceeds to identify 
another potential food source $p'_s$ using the 
following equation: 

$$p'_{s,w} = p_{s,w} + b \times (p_{j,w} - p_{l,w})$$

In this equation, $p_j$ and $p_l$ represent randomly 
selected food sources from the available options. 
The variable $b$, which ranges from 0 to 1, is a 
random variable. The value of $w$ ranges from 1 to $t$, 
where $t$ represents the total number of candidates. 
This equation allows the onlooker bee to explore and 
evaluate alternative food sources based on a 
combination of the selected sources and random 
factors. Algorithm 9 outlines the steps involved in 
sending onlooker bees, which involves the selection 
of food sources by the onlooker bees. 


Algorithm 9: Onlooker Bees 


Step 1: Initialize an empty list to store the food 
        sources selected by the onlooker bees. 
Step 2: For each onlooker bee (from 1 to $t$), perform 
        the following steps: 
        a). Select a food source $p_s$ randomly from 
            the available options. 
        b). Generate two random indices, $j$ and $l$, 
            to identify two other food sources, $p_j$ 
            and $p_l$, in the list. 
        c). Calculate the new food source $p'_{s,w}$ 
            using the equation in Step 3. 
Step 3: $p'_{s,w} = p_{s,w} + b * (p_{j,w} - p_{l,w})$ 
Step 4: Add the newly calculated food source $p'_s$ to 
        the list of selected food sources. 
Step 5: Return the list of food sources selected by 
        the onlooker bees. 
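
The onlooker step can be sketched as follows. Two choices in this example are assumptions: a single random factor $b$ is drawn per candidate rather than per dimension, and the optional `weights` argument (for instance the attraction values $M(p)$) is offered as a way to bias the choice of $p_s$, whereas Algorithm 9 describes a uniformly random choice.

```python
import random


def onlooker_update(sources, s, rng):
    """New candidate p'_s = p_s + b * (p_j - p_l) for random j and l."""
    j, l = rng.sample(range(len(sources)), 2)  # two distinct random sources
    b = rng.random()
    return [sources[s][w] + b * (sources[j][w] - sources[l][w])
            for w in range(len(sources[s]))]


def send_onlookers(sources, n_onlookers, rng, weights=None):
    """Let each onlooker pick a source and generate a candidate from it.

    weights: optional selection weights (assumption); None means uniform.
    """
    selected = []
    for _ in range(n_onlookers):
        s = rng.choices(range(len(sources)), weights=weights, k=1)[0]
        selected.append(onlooker_update(sources, s, rng))
    return selected
```

Passing the attraction values as `weights` would make sparse, non-dominated regions more likely to be explored, which matches the intent of the evaluation step.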


3.2.6. Sending of Scout Bees 

In the standard ABC algorithm, when the food 
supply of a bee is exhausted, the bee takes on the 
role of a scout bee and is sent out again to actively 
search for a new food source. In the SABC 
algorithm, a specific approach is implemented to 
simulate this behavior. It involves performing a 
"limit" check to determine whether any food sources 
have exceeded a certain number of cycles without 
being further improved. If such cases are identified 
during an iteration, at least one of these food 
sources will be eliminated. The details of dispatching 
scout bees are provided in Algorithm 10, which 
outlines the specific steps involved in this process. 


Algorithm 10: Scout Bees 

Step 1: Initialize an empty list, scout_bees, to 
        store the potential scout bee actions. 
Step 2: For each food source $p$ in the colony: 
        • If the number of cycles for $p$ exceeds the 
          limit, add $p$ to scout_bees. 
Step 3: If scout_bees is not empty: 
        • Select a random food source $p'$ from 
          scout_bees. 
        • Replace the food source $p'$ with a new 
          randomly generated food source. 
Step 4: Return the updated colony with the potential 
        scout bee actions. 
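
The scout step of Algorithm 10 can be sketched as follows. The bounds `lower` and `upper` used to generate the replacement, and the reset of the trial counter for the new source, are assumptions made for this illustration.

```python
import random


def send_scouts(sources, trials, limit, lower, upper, rng):
    """Replace one exhausted food source with a random one (in place)."""
    # Steps 1-2: collect the sources whose cycle count exceeds the limit.
    exhausted = [i for i, count in enumerate(trials) if count > limit]
    if exhausted:
        # Step 3: pick one exhausted source at random and regenerate it.
        i = rng.choice(exhausted)
        sources[i] = [lo + rng.random() * (hi - lo)
                      for lo, hi in zip(lower, upper)]
        trials[i] = 0  # assumption: a fresh source restarts its counter
    return sources
```

Replacing only one exhausted source per call mirrors the statement that at least one such source is eliminated in an iteration; a variant could replace all of them.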


3.2.7. Performing the Local search 

The local search strategy encompasses two 
essential stages: selection and neighbourhood 
exploitation. In the selection stage, the algorithm 
identifies the most promising locations, considering 
them as the neighbourhoods of the selected 
candidates. The concept of “neighbourhood” refers 
to the possibility of exploring a candidate’s solution 
space by modifying one of its dimensions. This 
approach allows for a focused exploration of nearby 


solutions. As the intensity of the local search 
increases, reaching a certain threshold, it becomes 
challenging to find better candidates. This implies 
that searching for a new food source within the 
vicinity takes longer and reduces the overall 
effectiveness of the algorithm. The SABC algorithm 
employs a termination criterion for the local search 
operation to address this. If a neighbourhood has 
been searched a specified number of times, referred 
to as the “limit,” without finding a superior 
candidate, the search is discontinued. 


During the neighbourhood exploitation 
phase, the algorithm evaluates the potential of other 
candidates' neighbourhoods to identify the most 
promising ones in the current context. Specifically, 
for the selected candidates, denoted as $d_s$, the 
algorithm generates a set of $t$ candidates 
$(p_1, p_2, \ldots, p_t)$ using Eq.(14). Eq.(14) contains 
the calculations and transformations necessary to 
explore and exploit the neighbourhoods of the 
selected candidates. This process aims to identify 
new and potentially superior solutions within the 
local search space. By incorporating the selection 
and neighbourhood exploitation stages, the SABC 
algorithm optimizes its search for improved 
candidates and enhances its ability to discover more 
promising solutions. 

$$p'_{s,w} = \begin{cases} p_{s,w}, & \text{if } w \neq s \\ ZV^{w} + b_3\,(OV^{w} - ZV^{w}), & \text{if } w = s \end{cases} \qquad (14)$$

where $ZV^{w}$ and $OV^{w}$ represent the lower bound and 
upper bound of dimension $w \in [1, t]$, and 
$b_3 \in [0, 1]$ is a random number. Algorithm 11 shows 
the process of local searching. 


Algorithm 11: Local Search 
Step 1: Initialize an empty list to store the 
        selected candidates. 
Step 2: Select the most promising candidates from the 
        list based on their potential, considering 
        them as the initial neighbourhoods. 
Step 3: For each selected candidate, perform the 
        following steps: 
        a). Generate $t$ candidates by applying the 
            neighbourhood exploration equation 
            (Eq.(14)) using the candidate's 
            information and the neighbourhood 
            exploration parameter (0). 
        b). Evaluate each generated candidate to 
            determine its quality or fitness. 
        c). If a generated candidate is better than 
            the currently selected candidate, replace 
            it. 
        d). Repeat steps 3a-3c until $t$ candidates 
            have been evaluated. 
Step 4: If the search limit (limit) for neighbourhood 
        exploration is reached for a candidate's 
        neighbourhood without finding a superior 
        candidate, terminate the local search for 
        that neighbourhood. 
Step 5: Return the list of selected candidates. 
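
A single-objective sketch of the local search is given below. It rests on several assumptions that go beyond the text: `fitness` is a hypothetical scalar quality function to be maximized, one neighbor is generated per dimension by resampling that dimension between its lower and upper bound, the "limit" counts consecutive rounds without improvement, and `max_rounds` is a safety cap added only for this example.

```python
import random


def local_search(candidate, fitness, lower, upper, limit, rng, max_rounds=1000):
    """Resample one dimension at a time and keep any improving neighbor."""
    best = list(candidate)
    best_fit = fitness(best)
    failures = 0
    for _ in range(max_rounds):
        if failures >= limit:
            break  # the neighbourhood was searched `limit` times in vain
        improved = False
        for k in range(len(best)):
            neighbor = list(best)
            # Redraw dimension k uniformly between its bounds.
            neighbor[k] = lower[k] + rng.random() * (upper[k] - lower[k])
            neighbor_fit = fitness(neighbor)
            if neighbor_fit > best_fit:
                best, best_fit, improved = neighbor, neighbor_fit, True
        failures = 0 if improved else failures + 1
    return best, best_fit
```

Because a neighbor replaces the current candidate only when it is strictly better, the returned fitness is never worse than that of the starting candidate.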


Algorithm 12 combines the components described 
above into the complete SABC method. 


Algorithm 12: SABC 


Input: 
• Population size ($C$) 
• Search limit for neighbourhood exploration 
  (limit) 
• Maximum size of the external population Archive 
  ($D$) 
Output: 
• Final archive containing selected food sources 
Procedure: 

Step 1: Allocate space for the external population 
        Archive. 
Step 2: Initialize the food sources $H$ by calling the 
        initialization function, passing the 
        population size ($C$) and the archive. 
Step 3: While the termination condition is not 
        satisfied, repeat the following steps: 
        a). Send employed bees to explore and exploit 
            the food sources in $H$. Call the 
            send_employed_bees function, passing $H$ 
            as the input. 
        b). Perform local search operations on the 
            archive using the LocalSearch function. 
        c). Form new food sources by updating the 
            archive based on the food sources in $H$. 
            Call the forming_new_food_sources 
            function, passing the archive and $H$ as 
            inputs. 
        d). Evaluate the quality of the food sources 
            in $H$ using the evaluation strategy 
            (Algorithm 8). 
        e). Send onlooker bees to select promising 
            food sources. Call the send_onlookers 
            function. 
        f). Send scout bees to explore new food 
            sources. Call the send_scouts function. 
Step 4: Once the termination condition is satisfied, 
        return the final archive. 


The SABC algorithm starts by allocating 
space for the external population archive. It then 
initializes the food sources $H$ using the 
initialization function. The algorithm proceeds with 
a loop until the termination condition is met. Within 
each iteration, employed bees are sent to explore and 
exploit the food sources in $H$. Local search 
operations are performed on the archive to enhance 
the quality of the solutions. New food sources are 
formed by updating the archive based on the food 
sources in $H$. The quality of the food sources in $H$ 
is evaluated using the evaluation strategy 
(Algorithm 8). Based on these evaluations, onlooker 
bees are sent to select promising food sources. 
Additionally, scout bees are deployed to explore 
new food sources. Once the termination condition is 
satisfied, the final archive is returned as the 
algorithm's output. The SABC algorithm combines 
various bee-inspired mechanisms to search for 
high-quality food sources efficiently. By iteratively 
exploring, exploiting, and evaluating the solutions, 
it aims to converge towards optimal or near-optimal 
solutions. 


3.3. Fusion of SABC and DQN 

The fusion of the Sophisticated Artificial 
Bee Colony (SABC) algorithm and Deep Q- 
Networks (DQN) can lead to the development of a 
novel approach for sentiment classification known 
as Sophisticated Artificial Bee Colony-Inspired 
Deep Q-Networks (SABC-DQN). SABC is a 
metaheuristic optimization algorithm inspired by the 
foraging behavior of honeybees. It utilizes multiple 
artificial bees to explore the search space and exploit 
promising solutions. SABC excels at finding optimal 
or near-optimal solutions to various optimization 
problems. DQN, on the other hand, is a 
reinforcement learning algorithm that combines 
deep neural networks with Q-learning. It has been 
widely used in game playing and_ robotics, 
demonstrating its ability to learn optimal policies 
through trial and error. 
By fusing SABC and DQN, we can
leverage the strengths of both approaches to enhance 
sentiment classification. The SABC component can 
efficiently explore and exploit the search space, 
allowing for better identification of sentiment- 
related features or patterns. The DQN component 
can utilize its deep neural network architecture and 
reinforcement learning capabilities to learn and 
improve sentiment classification policies over time. 
In the context of sentiment classification, SABC- 
DQN can be employed as follows: 

• Initial feature selection: SABC can be
utilized to explore and select the most 
relevant features from a given dataset, 
optimizing the feature representation for 
sentiment analysis. 

• Training phase: DQN can be trained using
the selected features as input, with 
sentiment labels as the target. The DQN 
learns to predict sentiment labels based on 
the given feature representations, adapting 
its policy through reinforcement learning. 

• Testing phase: The trained SABC-DQN
model can be used to classify the sentiment 
of new, unseen text samples by leveraging 
the learned policy to make accurate 
predictions. 
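
The three phases above can be illustrated with a small sketch. It rests on several assumptions that are not taken from the paper: each review is treated as a one-step episode (state = selected features, action = predicted label, reward = +1 for a correct label and -1 otherwise), the Q-function is linear rather than a deep network to keep the example short, and the feature mask is fixed by hand where SABC would supply it. The names `select` and `train_q_classifier` are hypothetical.

```python
import random


def select(x, mask):
    """Keep only features whose mask bit is 1 (mask assumed to come from SABC)."""
    return [v for v, keep in zip(x, mask) if keep]


def train_q_classifier(X, y, n_classes, mask, epochs=200, lr=0.1, eps=0.1, seed=0):
    """Train a linear Q-function with epsilon-greedy one-step Q-learning."""
    rng = random.Random(seed)
    n_feat = sum(mask)
    # One weight vector (plus a bias term) per action, i.e. per sentiment label.
    W = [[0.0] * (n_feat + 1) for _ in range(n_classes)]

    def q_values(s):
        phi = s + [1.0]  # append the bias input
        return [sum(w * v for w, v in zip(W[a], phi)) for a in range(n_classes)]

    def greedy(q):
        return max(range(n_classes), key=q.__getitem__)

    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            s = select(X[i], mask)
            q = q_values(s)
            # Explore with probability eps, otherwise exploit the current estimate.
            a = rng.randrange(n_classes) if rng.random() < eps else greedy(q)
            r = 1.0 if a == y[i] else -1.0
            # The episode ends after one step, so the target is the reward itself.
            td_error = r - q[a]
            for j, v in enumerate(s + [1.0]):
                W[a][j] += lr * td_error * v

    def predict(x):
        return greedy(q_values(select(x, mask)))

    return predict
```

In this sketch the testing phase is the returned `predict` function, which applies the same mask to an unseen feature vector and returns the label with the highest estimated Q-value.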


The fusion of SABC and DQN in SABC- 
DQN for sentiment classification allows for a more 
sophisticated and efficient approach to sentiment 
analysis tasks. By incorporating SABC’s exploration 
and exploitation capabilities with the deep learning 
and reinforcement learning capabilities of DQN, the 
SABC-DQN model can potentially achieve
improved sentiment classification performance 
compared to traditional methods. 


4. ABOUT THE DATASET


The “Course Reviews on Coursera” dataset
provides a wealth of information about user ratings 
and reviews for courses available on the Coursera 
platform. With over 1.45 million reviews and 
corresponding ratings, this dataset offers a valuable 
opportunity to gain insights into user preferences and 
sentiments. Each review entry in the dataset contains 
the course review text, the reviewer’s name, the date 
when the review was posted, the rating the reviewer 
gave, and the course ID. Analyzing this dataset 
allows researchers to understand the factors 
influencing user ratings, identify courses with high 
or low ratings, and explore the relationship between 
review text sentiment and rating scores. Using
natural language processing techniques, sentiment
analysis algorithms can classify review texts as 
positive, negative, or neutral. This analysis can help 
identify the most common sentiments expressed by 
users, understand the factors that contribute to 
positive or negative reviews, and potentially predict 
the sentiment of new reviews. Moreover, by 
examining the distribution of rating scores, it is 
possible to identify the most highly-rated courses on 
Coursera, investigate trends in user ratings over 
time, and uncover any biases or patterns in the 
ratings given by different reviewers or institutions. 


Table 1. Dataset: Coursera Courses 


Feature     | Data Type | Description
name        | character | The name of the course
institution | character | The institution offering the course
course_url  | character | The URL of the course
course_id   | character | The unique identifier for the course


Table 2. Dataset: Coursera Reviews 


Feature      | Data Type | Description
reviews      | character | The text of the course review
reviewers    | character | The name of the reviewer who wrote the review
date_reviews | date      | The date when the review was posted
rating       | integer   | The rating score given by the reviewer of the course
course_id    | character | The unique identifier for the course associated with the review
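
The two tables share the `course_id` key, so reviews can be joined to course names and given a coarse sentiment label derived from the rating. The sketch below is illustrative only. The rating thresholds (1 to 2 negative, 3 neutral, 4 to 5 positive) are an assumption, since the paper does not state how labels were obtained, and the function names are hypothetical.

```python
def rating_to_sentiment(rating):
    """Map an integer rating to a label (thresholds are an assumption)."""
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def label_reviews(reviews, courses):
    """Attach the course name and a sentiment label to each review row."""
    # Lookup table from course_id to course name, built from the courses table.
    names = {c["course_id"]: c["name"] for c in courses}
    labelled = []
    for r in reviews:
        row = dict(r)
        row["name"] = names.get(r["course_id"])  # None if the course is unknown
        row["sentiment"] = rating_to_sentiment(r["rating"])
        labelled.append(row)
    return labelled
```

Each row is expected to be a dictionary keyed by the column names listed in Tables 1 and 2.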


5. RESULTS AND DISCUSSION 


5.1. Classification Accuracy and F-Measure 
Analysis 

The NBCA is a probabilistic classifier that 
utilizes Bayes’ theorem. Given the class labels, it 
assumes that the features are conditionally 
independent, which is a simplistic but often 
reasonable assumption. NBCA calculates the 
probability of an instance belonging to a particular 
class based on observed feature values and assigns
the class label with the highest probability. The 
results in Table 3 reveal that NBCA achieved a 
classification accuracy of 50.833% and an F- 
measure of 51.315%. These outcomes indicate that 
NBCA exhibited relatively lower performance than 
the other classifiers. The feature independence 
assumption might have limited its capability to 
capture complex relationships in the data, resulting 
in reduced accuracy. 
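
The decision rule described above is the standard naive Bayes rule. In the usual notation, which is an assumption here because the paper does not give symbols, c is a class label and x_1, ..., x_n are the feature values of an instance:

```latex
% Naive Bayes: choose the class with the highest posterior, with the
% likelihood factorised under the conditional independence assumption
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```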

RFCA is an ensemble learning method that 
combines multiple decision trees to make 
predictions. It constructs a forest of decision trees, 
each trained on a different subset of the data and 
using a random subset of features. During prediction, 
RFCA aggregates the predictions from all the trees 
to make the final decision. According to Table 3, 
RFCA achieved a classification accuracy of 
65.003% and an F-measure of 64.728%. These 
findings indicate superior performance compared to 
NBCA. By harnessing the power of ensemble 
learning, RFCA was able to capture more intricate 
patterns in the data, resulting in improved accuracy. 


SABC-DQN is an advanced classifier that 
combines concepts from artificial bee colony 
optimization and deep Q-networks. It integrates 
reinforcement learning techniques with a
bee-inspired optimization algorithm to train a deep
neural network for classification tasks. SABC-DQN 
leverages the exploration-exploitation trade-off to 
search for optimal solutions efficiently and learns to 
make accurate predictions through a reward-based 
learning process. As per Table 3, SABC-DQN 
achieved the highest classification accuracy of 
87.325% and the highest F-measure of 87.620%. 
These results illustrate the exceptional performance 
of SABC-DQN compared to NBCA and RFCA. By 
harnessing deep learning and reinforcement
learning, SABC-DQN could learn intricate data 
representations, adapt predictions based on 
feedback, and achieve heightened accuracy and F- 
measure. 

Table 3. Classification Accuracy and F-Measure Results

Classifiers | Classification Accuracy (%) | F-Measure (%)
NBCA        | 50.833                      | 51.315
RFCA        | 65.003                      | 64.728
SABC-DQN    | 87.325                      | 87.620


The NBCA assumes feature independence, 
RFCA utilizes ensemble learning with decision
trees, and SABC-DQN combines deep and
reinforcement learning. The outstanding
performance of SABC-DQN can be attributed to its
ability to capture complex patterns and optimize the 
classification process by integrating sophisticated 
techniques. 
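
For readers who want to reproduce metrics of this kind, the sketch below computes classification accuracy and F-measure from confusion-matrix counts. It assumes a binary setting and the usual F1 definition (harmonic mean of precision and recall). The paper does not state how the scores were averaged across sentiment classes, so this is not necessarily the exact procedure behind Table 3, and the function name is hypothetical.

```python
def accuracy_and_f_measure(tp, fp, fn, tn):
    """Return (accuracy, F-measure) as fractions in [0, 1] for binary counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, f_measure
```

Multiplying both values by 100 gives percentages comparable in scale to the reported results.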


Figure 1. Classification Accuracy and F-Measure (legend: NBCA, RFCA, SABC-DQN; axis: Performance Metrics)


5.2. Fowlkes-Mallows Index and Matthews 
Correlation Coefficient Analysis 

The Fowlkes-Mallows Index (FMI)
measures the similarity between two clusterings or,
in classification, between predicted and true labels.
It is the geometric mean of precision and recall,
two metrics commonly used in information retrieval
and classification tasks. The Matthews Correlation
Coefficient (MCC) is commonly used to evaluate
binary classification models; it takes all four cells
of the confusion matrix into account and ranges
from -1 to +1, with 0 indicating chance-level
prediction.
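
Both metrics can be written in terms of the binary confusion-matrix counts TP, FP, FN, and TN. These are the standard definitions and the symbols are an assumption, since the paper does not give formulas. The values reported in this section appear to be these quantities multiplied by 100, which is an inference from their magnitudes rather than something the paper states.

```latex
% Fowlkes-Mallows Index: geometric mean of precision and recall
\mathrm{FMI} = \sqrt{\frac{TP}{TP + FP} \cdot \frac{TP}{TP + FN}}

% Matthews Correlation Coefficient, bounded in [-1, 1]
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
                    {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
```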

Figure 2 represents the Fowlkes-Mallows 
Index (FMI) and Matthews Correlation Coefficient 
(MCC) values for three classification algorithms: 
NBCA, RFCA, and SABC-DQN. Table 4 provides 
the specific FMI and MCC scores for each 
algorithm. 
Figure 2. Fowlkes-Mallows Index and Matthews Correlation Coefficient (axis: Performance Metrics)

NBCA is based on the assumption of 
independence between features, given the class 
label. This assumption simplifies the computation 
and allows for efficient classification. NBCA 
performs well when the independence assumption 
holds true or is close to being true in the dataset. It is 
particularly effective when dealing with large 
feature spaces and relatively small training datasets. 
The algorithm utilizes probabilistic calculations to 
estimate the likelihood of a data point belonging to a 
particular class. In this case, the weak
performance of NBCA, as indicated by an FMI score
of 51.317 and an MCC score of 1.663, could be due
to the limitations of the independence assumption or 
the dataset’s complexity, which may not align well 
with the assumption. 


RFCA is an ensemble learning method that 
combines multiple decision trees to make 
predictions. It generates multiple decision trees 
using bootstrapping and random feature selection, 
which helps in reducing overfitting and increasing 
robustness. The algorithm leverages the concept of 
majority voting or averaging predictions from 
individual trees to make the final prediction. RFCA 
can capture complex relationships between features 
and class labels, making it suitable for various 
datasets. The improved performance of RFCA, as 
indicated by an FMI score of 64.730 and an MCC 
score of 30.007, can be attributed to its ability to 
handle diverse feature interactions and generalize 
well to unseen data. 


Table 4. Fowlkes-Mallows Index and Matthews 
Correlation Coefficient Results 


Classifiers | FMI    | MCC
NBCA        | 51.317 | 1.663
RFCA        | 64.730 | 30.007
SABC-DQN    | 87.623 | 74.649


SABC is a metaheuristic algorithm
inspired by the behavior of honeybees. It optimizes 
the search space by combining exploration and 
exploitation strategies. DQN, on the other hand, is a 
reinforcement learning algorithm that utilizes neural 
networks to approximate the Q-function, enabling 
efficient learning and decision-making in sequential 
tasks. By combining ABC and DQN, SABC-DQN 
benefits from both the exploration and exploitation 
capabilities of ABC and the efficient learning and 
decision-making of DQN. The algorithm can 
effectively explore the search space and find optimal 
solutions while adapting to complex and dynamic 
environments. The superior performance of SABC- 
DQN, as indicated by an FMI score of 87.623 and an 
MCC score of 74.649, can be attributed to its ability 
to handle complex data patterns, optimize the 
classification process, and learn from environmental 
interactions. 


6. CONCLUSION 

The SABC-DQN (Sophisticated Artificial 
Bee Colony Inspired Deep Q-Networks) framework 
presents a novel and innovative approach for 
enhancing the sentiment analysis of Coursera course 
reviews. By integrating Artificial Bee Colony (ABC) 
optimization algorithms with Deep Q-Networks 
(DQN), SABC-DQN offers a unique mechanism to 
capture and understand the nuanced sentiments 
expressed in textual data. Experimental evaluations 
on a dataset of Coursera course reviews demonstrate 
the effectiveness and superiority of the SABC-DQN 
approach compared to existing sentiment analysis 
methods. SABC-DQN achieves higher accuracy, 
precision, recall, and F1-score, showcasing its
potential to improve sentiment analysis performance 
significantly. SABC-DQN exhibits robustness in 
handling variations in review length, domain- 
specific jargon, and grammatical errors. This 
robustness ensures the applicability and reliability of 
the approach across different domains and linguistic 
variations. The SABC-DQN framework opens new 
avenues for sentiment analysis, enabling a deeper 
understanding of learner opinions and attitudes in 
online educational platforms like Coursera. Future
research can explore further enhancements and 
extensions to the SABC-DQN approach, 
contributing to the advancement of sentiment 
analysis techniques and their application in various 
domains. 